Goto

Collaborating Authors

 trace norm regularization


41ae36ecb9b3eee609d05b90c14222fb-Reviews.html

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. Please also discuss on how this can be extended to the analysis of ADMM. This paper is an extension of Tseng [20], Tseng and Yun A coordinate gradient descent method for nonsmooth separable minimization and Zhang et al. [22], which established the same result using the error-bound condition for lasso and group lasso, to the trace norm. This is a non-trivial extension but the contribution seems purely technical. The presentation of the proofs is mostly clear.


A New Convex Relaxation for Tensor Completion

Neural Information Processing Systems

We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on the extension of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean unit ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable.


On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization

Neural Information Processing Systems

Motivated by various applications in machine learning, the problem of minimizing a convex smooth loss function with trace norm regularization has received much attention lately. Currently, a popular method for solving such problem is the proximal gradient method (PGM), which is known to have a sublinear rate of convergence. In this paper, we show that for a large class of loss functions, the convergence rate of the PGM is in fact linear. Our result is established without any strong convexity assumption on the loss function. A key ingredient in our proof is a new Lipschitzian error bound for the aforementioned trace norm-regularized problem, which may be of independent interest.


Trace Lasso: a trace norm regularization for correlated designs

Neural Information Processing Systems

In this paper, we introduce a new penalty function which takes into account the correlation of the design matrix to stabilize the estimation. This norm, called the trace Lasso, uses the trace norm of the selected covariates, which is a convex surrogate of their rank, as the criterion of model complexity. We analyze the properties of our norm, describe an optimization algorithm based on reweighted least-squares, and illustrate the behavior of this norm on synthetic data, showing that it is more adapted to strong correlations than competing methods such as the elastic net.


A New Convex Relaxation for Tensor Completion

Neural Information Processing Systems

We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on a generalization of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable.


Trace Lasso: a trace norm regularization for correlated designs

Neural Information Processing Systems

Using the \ell_1 -norm to regularize the estimation of the parameter vector of a linear model leads to an unstable estimator when covariates are highly correlated. In this paper, we introduce a new penalty function which takes into account the correlation of the design matrix to stabilize the estimation. This norm, called the trace Lasso, uses the trace norm of the selected covariates, which is a convex surrogate of their rank, as the criterion of model complexity. We analyze the properties of our norm, describe an optimization algorithm based on reweighted least-squares, and illustrate the behavior of this norm on synthetic data, showing that it is more adapted to strong correlations than competing methods such as the elastic net.


Trace norm regularization for multi-task learning with scarce data

Boursier, Etienne, Konobeev, Mikhail, Flammarion, Nicolas

arXiv.org Machine Learning

Multi-task learning leverages structural similarities between multiple tasks to learn despite very few samples. Motivated by the recent success of neural networks applied to data-scarce tasks, we consider a linear low-dimensional shared representation model. Despite an extensive literature, existing theoretical results either guarantee weak estimation rates or require a large number of samples per task. This work provides the first estimation error bound for the trace norm regularized estimator when the number of samples per task is small. The advantages of trace norm regularization for learning data-scarce tasks extend to meta-learning and are confirmed empirically on synthetic datasets.

  multi-task learning, scarce data, trace norm regularization
2202.06742
  Genre: Research Report (0.40)

Trace Lasso: a trace norm regularization for correlated designs

Grave, Edouard, Obozinski, Guillaume R., Bach, Francis R.

Neural Information Processing Systems

Using the $\ell_1$-norm to regularize the estimation of the parameter vector of a linear model leads to an unstable estimator when covariates are highly correlated. In this paper, we introduce a new penalty function which takes into account the correlation of the design matrix to stabilize the estimation. This norm, called the trace Lasso, uses the trace norm of the selected covariates, which is a convex surrogate of their rank, as the criterion of model complexity. We analyze the properties of our norm, describe an optimization algorithm based on reweighted least-squares, and illustrate the behavior of this norm on synthetic data, showing that it is more adapted to strong correlations than competing methods such as the elastic net. Papers published at the Neural Information Processing Systems Conference.


A New Convex Relaxation for Tensor Completion

Romera-Paredes, Bernardino, Pontil, Massimiliano

Neural Information Processing Systems

We study the problem of learning a tensor from a set of linear measurements. A prominent methodology for this problem is based on the extension of trace norm regularization, which has been used extensively for learning low rank matrices, to the tensor setting. In this paper, we highlight some limitations of this approach and propose an alternative convex relaxation on the Euclidean unit ball. We then describe a technique to solve the associated regularization problem, which builds upon the alternating direction method of multipliers. Experiments on one synthetic dataset and two real datasets indicate that the proposed method improves significantly over tensor trace norm regularization in terms of estimation error, while remaining computationally tractable.


On the Linear Convergence of the Proximal Gradient Method for Trace Norm Regularization

Hou, Ke, Zhou, Zirui, So, Anthony Man-Cho, Luo, Zhi-Quan

Neural Information Processing Systems

Motivated by various applications in machine learning, the problem of minimizing a convex smooth loss function with trace norm regularization has received much attention lately. Currently, a popular method for solving such problem is the proximal gradient method (PGM), which is known to have a sublinear rate of convergence. In this paper, we show that for a large class of loss functions, the convergence rate of the PGM is in fact linear. Our result is established without any strong convexity assumption on the loss function. A key ingredient in our proof is a new Lipschitzian error bound for the aforementioned trace norm-regularized problem, which may be of independent interest.